The File Mover: high-performance data transfer for the grid

Authors

  • Cosimo Anglano
  • Massimo Canonico
Abstract

Research in many scientific disciplines (e.g., High-Energy Physics, Climate Modeling, and Life Sciences) involves the production and analysis of massive data collections, whose archival, retrieval, and analysis require the coordinated use of high-capacity computing, network, and storage resources. To obtain satisfactory performance, these applications require high-performance, reliable data transfer mechanisms that minimize the data transfer time, which often dominates their execution time. In this paper we present the File Mover, an efficient data transfer system specifically tailored to the needs of data-intensive applications, which exploits the overlay network paradigm to provide superior performance with respect to conventional file transfer systems. An extensive experimental evaluation, carried out by means of a proof-of-concept implementation of the File Mover for a variety of network scenarios, shows the ability of the File Mover to outperform alternative data transfer systems.

Similar resources

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...

E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...

A Framework for Data Management and Transfer in Grid Environments

The main obstacles to grid file management come from the fact that grid file resources are typically stored in heterogeneous and distributed environments and accessed through various protocols. In this paper, we propose a grid file management system called Vega [1][2] Hotfile2 for data-intensive applications in widely distributed systems and grid environments. Widely distributed and heterogeneous...

JPARSS: A Java Parallel Network Package for Grid Computing

The emergence of high-speed wide area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance due to network tuning of TCP window size to improve bandwidth and to reduce latency on a high-speed wide area network. This paper presents a Java package called JPARSS (Java Parallel Secure Stream (So...

Journal title:
  • Concurrency and Computation: Practice and Experience

Volume 20  Issue

Pages  -

Publication date 2008